THE PROBABILITY OF THE ARITHMETIC MEAN COMPARED WITH

THAT OF CERTAIN OTHER FUNCTIONS OF THE

MEASUREMENTS.

By Edward L. Dodd.

1. Introduction. 

In his Theorie der Beobachtungsfehler, Czuber has exhibited many of
the attempts made to unite the principle of the arithmetic mean as the
"most probable value" with the Gaussian probability law. He* quotes
from Bertrand† who gives an example to show that this law and principle
are not strictly compatible. It is one object of this paper to show this
incompatibility by other methods: to exhibit functions of the
measurements to which the Gaussian law assigns a greater probability than
it assigns to the arithmetic mean.‡

Wrapt up with the Gaussian law are several assumptions. Of these, 
we note the following in particular. 

1. A true value, a, exists for the unknown. § 

2. Associated with a measurement, m, or with a set of n measurements
taken under similar circumstances, there exists a positive constant, h,
called the measure‖ of precision.

3. An objective¶ or physical probability may exist, when its value is
unknown or but "approximately" known.

For example: if an urn contains just n balls of which w are white, the 

* Loc. cit., p. 51.

† Calcul des Probabilités (1889), p. 180.

‡ Sets of axioms have been proposed to ground the principle of the arithmetic mean as the
"most probable value." See Czuber, loc. cit., p. 16-47; also Schimmack, Mathematische Annalen,
68. Band, p. 125. These axioms are, in general, of such a nature that they may be used equally
well to ground the principle that the arithmetic mean is the least probable value.

§ This assumption and the next one presuppose that a unit of measure has been adopted.
In changing from meters to centimeters, a would be multiplied by 100, h divided by 100, but ha
would be invariant.

‖ Here we merely set up the "assumption" or "axiom" that h exists. A commonly
accepted approximation for h is $\sqrt{(n-1)/(2\sum v^2)}$, in which $\sum v^2$ is the sum of
the squares of the residuals of the measurements; i.e., $v_i = M - m_i$, M being the
arithmetic mean. See Czuber, Wahrscheinlichkeitsrechnung, I, p. 281.

¶ For the view that probability is "purely subjective," see Sigwart's Logic, trans. by Dendy,
vol. 2, p. 224. For a distinction between objective probability and subjective probability, see
Kries, Die Prinzipien der Wahrscheinlichkeitsrechnung (1886), p. 95. Reference is made to the
distinction in the Encyklopädie (I, D. 1), p. 735.

probability of its delivering a white ball is w/n, whether anyone knows
what n and w are, or not. Note that h in 2. is unknown, as well as a in 1.
The Gaussian probability law then states that

$$P = \frac{h}{\sqrt{\pi}} \int_{x'}^{x''} e^{-h^2 x^2}\,dx \tag{1}$$

is the probability that the error, x = a − m, will lie between x' and x'',
where x' ≤ x''. The following, also, are important assumptions underlying
(1), or inferences from (1), according to the viewpoint.

4. The probability, P, is a function of h, x', and x'', but not of a.*
This does not prevent the probability of the error of certain functions of
the measurements being functions of a, h, x' and x''.

5. P is never zero when x' is less than x''. Roughly speaking: "Any
error is possible."

6. P is zero if x' equals x''. "The probability of any particular error
is zero."

7. P is unchanged, if −x'', −x' replace x', x''. "The probability
of a negative error is the same as that of the corresponding positive error."

8. P is greater for the interval (−α, +α) than for any other interval
of length 2α. "Zero is the most probable error."

9. P is unity for the interval (−∞, +∞). Thus the probability for
very large errors is very small.

These assumptions, in general, are mathematical refinements, or ab- 
stractions, — somewhat comparable with the conception in geometry of a 
line with no breadth or thickness. However, it is not the object of this 
paper to defend these assumptions or to prove the Gaussian law, but to 
investigate its consequences. 

From 6., it follows that under the Gaussian law, there is, strictly speaking,
no most probable error. Each of the infinite number (see 5.) of possible
errors has the same probability, zero.† For the discussion of the relative
magnitude of probabilities, some definition is needed. Corresponding to
each function, f, of the measurements, there is an error, a − f. When f
lies in the interval from a − α to a + α, its error lies in the interval from
−α to α.

Definition. The probability of $f_1$ will be said to be greater than that of $f_2$,
if the probability that the error of $f_1$ will‡ lie in the interval from −α to +α

* See Bertrand, loc. cit., p. 177.

† Similarly, the probability, a posteriori, that any particular real number be the true value
is zero, according to standard treatments. For example, see Poincaré, Calcul des Probabilités
(1896), p. 149. Set da = 0 in the numerator. There is here, then, no "most probable value"
for the unknown true value.

‡ The future tense. By "measurements," as discussed here, will be meant contemplated
measurements.

is greater than the probability that the error of $f_2$ will lie in the same interval,
for all positive values of α less than some α'; in other words, if the
probability that $f_1$ will differ from a by less than α is greater than that $f_2$ will differ
from a by less than α, when α < α'.

Some such restriction as that imposed upon α is needed to avoid
anomalies. For example: if α is taken equal to a, the probability of bm is
greater than that of m, where b is any proper fraction. For bm will lie in
the interval, (a − α, a + α) = (0, 2a), whenever m lies in (0, 2a/b).
Furthermore, it is natural to make α small; for, in general, the error of each
measurement will be small in comparison with a. But α cannot be made
zero; for the probability that $f_1$ or $f_2$ will equal a is zero.* For example:
the probability that the average of four measurements will differ from a by
less than α, is found by replacing h by 2h, and x', x'', by −α, α in (1).
This probability vanishes with α.
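The Gaussian law (1) and the four-measurement example can be evaluated numerically. The following is a minimal sketch, assuming that the integral in (1) is expressed through the error function; the function names and the value h = 50 are hypothetical and serve only as an illustration.

```python
import math


def gaussian_law(h, x_lo, x_hi):
    """Probability (1) that an error lies between x_lo and x_hi, for precision h."""
    # (h / sqrt(pi)) * integral of exp(-h^2 x^2) dx reduces to a difference of erf values.
    return 0.5 * (math.erf(h * x_hi) - math.erf(h * x_lo))


def mean_within(h, n, alpha):
    """Probability that the mean of n measurements differs from a by less than alpha."""
    # The mean obeys the same law with h replaced by sqrt(n) * h (2h when n = 4).
    return gaussian_law(math.sqrt(n) * h, -alpha, alpha)


h = 50.0  # hypothetical precision constant, chosen only for illustration
for alpha in (1e-2, 1e-3, 1e-4):
    print(alpha, mean_within(h, 4, alpha))
```

The printed probabilities shrink toward zero with α, which is why the definition compares probabilities for all small positive α rather than at α = 0.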

2. The Comparative Probabilities of the Arithmetic Mean of Measurements and 
the Square Root of the Mean of Their Squares. 

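Under the Gaussian law (1), it follows that the probability, $P_\alpha$, that
the average, M, of two measurements will lie in the interval, (a − α, a + α),
is given by

$$P_\alpha = \frac{\sqrt{2}\,h}{\sqrt{\pi}} \int_{-\alpha}^{\alpha} e^{-2h^2x^2}\,dx = \frac{2}{\sqrt{\pi}} \int_0^{\sqrt{2}h\alpha} e^{-t^2}\,dt = \Theta(\sqrt{2}\,h\alpha). \tag{2}$$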



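This is not a new theorem. It may be proved as follows.† Let the error
of $m_1$ be x, and that of $m_2$ be y. $P_\alpha$ is then the probability that

$$-\alpha \le a - \frac{m_1 + m_2}{2} \le \alpha,$$

or that

$$-2\alpha \le x + y \le 2\alpha,$$

$$-2\alpha - x \le y \le 2\alpha - x. \tag{3}$$

The first error, x, may be of any magnitude; but the second, y, must
then satisfy (3). Hence,

$$P_\alpha = \frac{h}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-h^2x^2} \left[ \frac{h}{\sqrt{\pi}} \int_{-2\alpha-x}^{2\alpha-x} e^{-h^2y^2}\,dy \right] dx = \frac{h^2}{\pi} \iint_S e^{-h^2(x^2+y^2)}\,dx\,dy.$$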






* Except for some trivial function, as 0·m + a.

† The proof given here is regarded as simpler than a proof using a non-convergent
"discontinuity factor," such as

$$\int_{-\infty}^{\infty} \cos(u\theta)\,d\theta \quad\text{or}\quad \int_{-\infty}^{\infty} e^{iu\theta}\,d\theta.$$

Likewise, it may be proved that the probability that the error, E, of M will lie in (0, α) is
$\tfrac{1}{2}\Theta(\sqrt{2}\,h\alpha)$. Thus the probability that E will lie in (x', x'') is given by (1) with h changed to $\sqrt{2}\,h$.

The field of integration, S, is bounded by two parallel lines, and its
width is $2\sqrt{2}\,\alpha$. If the axes are rotated through 45°, the integrand is
unchanged, but the boundaries of the field become
parallel to the Y axis, at a distance of $\sqrt{2}\,\alpha$ from it.
This gives (2).
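Result (2) can be checked by simulation. The sketch below is only an illustration under stated assumptions: it draws pairs of measurements from a normal distribution with standard deviation 1/(h√2), which is the Gaussian law with precision h, and compares the observed frequency with Θ(√2 hα), taking Θ to be the error function. The parameter values, the seed, and the function name are hypothetical.

```python
import math
import random


def simulate_mean_of_two(h, a, alpha, trials=200_000, seed=0):
    """Fraction of trials in which the mean of two measurements lies within alpha of a."""
    rng = random.Random(seed)
    sigma = 1.0 / (h * math.sqrt(2.0))  # precision h corresponds to std dev 1/(h*sqrt(2))
    hits = 0
    for _ in range(trials):
        m1 = rng.gauss(a, sigma)
        m2 = rng.gauss(a, sigma)
        if abs(0.5 * (m1 + m2) - a) < alpha:
            hits += 1
    return hits / trials


h, a, alpha = 1.0, 5.0, 0.4  # hypothetical values for illustration
estimate = simulate_mean_of_two(h, a, alpha)
exact = math.erf(math.sqrt(2.0) * h * alpha)  # right-hand side of (2)
print(estimate, exact)
```

The two printed numbers should agree to within the sampling error of the simulation, which is roughly 0.001 for this number of trials.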

Now let

$$\sqrt{\frac{m_1^2 + m_2^2 + \cdots + m_n^2}{n}}$$

be called the root-mean-square of the n measurements.

Theorem 1. Under the Gaussian probability law, the root-mean-square
of two measurements has a greater probability than their arithmetic mean,
provided the product of the precision constant by the true value is greater than 2.

Proof. The condition mentioned is

$$ha > 2. \tag{4}$$

Now h is positive, and thus a is also.
Let $P_\alpha'$ be the probability that

$$a - \alpha \le \sqrt{\frac{m_1^2 + m_2^2}{2}} \le a + \alpha.$$

The errors being x and y as before, x and y must satisfy

$$2(a - \alpha)^2 \le (x - a)^2 + (y - a)^2 \le 2(a + \alpha)^2; \tag{5}$$

in this, α is to be taken less than a. The point, (x, y), is then confined to
an annular region, A, bounded by two concentric circles, with centers at
(a, a) and radii, $\sqrt{2}(a - \alpha)$ and $\sqrt{2}(a + \alpha)$, respectively. The width of
the ring is $2\sqrt{2}\,\alpha$.



Then

$$P_\alpha' = \frac{h^2}{\pi} \iint_A e^{-h^2(x^2+y^2)}\,dx\,dy, \tag{6}$$



whereas $P_\alpha$ is a like integral taken over the strip, S, of the same width,
$2\sqrt{2}\,\alpha$. If the integrand is set equal to z, the locus is a surface of revolution
about the Z axis, and thus the integrand takes the same value for every
point on a circle, C, of radius, r, in the XY plane, centered at the origin.
Consider now the evaluation of $P_\alpha'$ and $P_\alpha$, under a transformation to polar
coordinates. If

$$\sqrt{2}\,\alpha < r < \sqrt{2}(2a - \alpha),$$

arcs of the circle, C, will be intercepted between the boundaries of S and
also of A; but the latter arcs will be the greater, because their chords are
greater: a straight line segment joining a point on one boundary of A
to a point on the other boundary will be greater than $2\sqrt{2}\,\alpha$, unless it is
a portion of the radius of the outer circle. Now, usually the integrand in
(6) will become inappreciable long before r reaches $2\sqrt{2}\,a$; usually, errors
double the true value are well-nigh impossible. The infinite portions of S
contribute next to nothing, when ha > 2, to the integral, $P_\alpha$; whereas
$P_\alpha'$ outstrips $P_\alpha$ in the stretch from 0 to $2\sqrt{2}\,a$. To get a numerical
relation between $P_\alpha$ and $P_\alpha'$, we may proceed as follows:

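Let the axes be rotated through 45°; and set $\beta = a\sqrt{2}$, $\delta = \alpha\sqrt{2}$.
Let the intersections of C with the boundaries of A in the first quadrant
be $(x_1, y_1)$ and $(x_2, y_2)$ when

$$\delta < r < 2\beta - \delta. \tag{7}$$

Find the value of $y_2 - y_1$ and to it apply the inequality,

$$\sqrt{c + d} - \sqrt{c - d} > d/\sqrt{c}, \quad\text{if } 0 < d < c.$$

Then, if D is the length of the chord,

$$D^2 > \frac{16\beta^2\delta^2}{4\beta^2 + \delta^2 - r^2}, \qquad D > 2\delta\left[1 + \frac{r^2 - \delta^2}{8\beta^2}\right].$$

Then, if θ is the angle formed by straight lines from the origin to $(x_1, y_1)$
and $(x_2, y_2)$, respectively, the arc exceeds its chord, so that

$$r\theta > D > 2\delta\left[1 + \frac{r^2 - \delta^2}{8\beta^2}\right]. \tag{8}$$

Now, if (7) did not need to be satisfied, it would be found upon passing
to polar coordinates, and using

$$\int_0^\infty e^{-h^2r^2}\,dr = \frac{\sqrt{\pi}}{2h}, \qquad \int_0^\infty r^2 e^{-h^2r^2}\,dr = \frac{\sqrt{\pi}}{4h^3},$$

that $P_\alpha'$ would, when δ is small, be greater than

$$\frac{2\delta h}{\sqrt{\pi}}\left[1 + \frac{1}{32h^2a^2}\right]. \tag{9}$$

But, because (8) was obtained from (7) with r < 2β − δ, a deduction
must be made from the bracket in (9) of an amount not greater than
$3/[80(ha)^3]$, together with an amount which approaches zero with δ,
because of the inequality, δ < r, in (7). The upper limit mentioned for
the former deduction is obtained by using the inequalities, […]

But

$$P_\alpha < \frac{2\delta h}{\sqrt{\pi}},$$

and hence if ha > 2, and δ sufficiently small,

$$P_\alpha' > P_\alpha + \frac{2\delta h}{\sqrt{\pi}}\left[\frac{1}{80(ha)^2}\right]. \quad\text{Q.E.D.}$$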

It is not to be supposed that 2 is the critical value for ha. But it is 
evident from (6) that, for a given a, it would be possible to choose an h 
so small that the integrand in (6) would be sensibly unity throughout A 
and the nearer portions of S. The remote portions of S would then make 
$P_\alpha$ greater than $P_\alpha'$.
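Both regimes discussed above (ha > 2, and h very small for a given a) can be examined numerically. The following sketch is an illustration, not the method of the proof: it integrates over the first measurement with a midpoint rule and treats the second measurement in closed form through the error function. The function names, the step count, and the sample values of h, a and α are assumptions made for the example.

```python
import math


def interval_prob(h, a, lo, hi):
    """Probability that one measurement (true value a, precision h) lies in [lo, hi]."""
    return 0.5 * (math.erf(h * (hi - a)) - math.erf(h * (lo - a)))


def rms_within(h, a, alpha, steps=20_000):
    """Probability that the root-mean-square of two measurements lies in (a - alpha, a + alpha)."""
    r_in_sq = 2.0 * (a - alpha) ** 2    # m1^2 + m2^2 must be at least this
    r_out_sq = 2.0 * (a + alpha) ** 2   # and at most this
    r_out = math.sqrt(r_out_sq)
    dx = 2.0 * r_out / steps
    total = 0.0
    for i in range(steps):
        m1 = -r_out + (i + 0.5) * dx    # midpoint rule over the first measurement
        hi = math.sqrt(r_out_sq - m1 * m1)
        lo = math.sqrt(max(r_in_sq - m1 * m1, 0.0))
        # The second measurement may lie in [lo, hi] or in [-hi, -lo].
        inner = interval_prob(h, a, lo, hi) + interval_prob(h, a, -hi, -lo)
        total += (h / math.sqrt(math.pi)) * math.exp(-(h * (m1 - a)) ** 2) * inner
    return total * dx


def mean_of_two_within(h, alpha):
    """Probability (2) for the arithmetic mean of two measurements."""
    return math.erf(math.sqrt(2.0) * h * alpha)


a, alpha = 1.0, 0.01  # hypothetical values for illustration
for h in (3.0, 0.1):  # ha = 3 satisfies (4); ha = 0.1 does not
    print(h * a, rms_within(h, a, alpha) / mean_of_two_within(h, alpha))
```

With these sample values the printed ratio exceeds unity slightly for ha = 3 and falls well below unity for ha = 0.1, in line with Theorem 1 and with the remark on small h.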

Theorem 2. Under the Gaussian law, the probability of the arithmetic
mean of three measurements is greater than the probability of their
root-mean-square.

Proof. In this case, the probability, $P_\alpha$, that the arithmetic mean will
differ from the true value by less than α, is $\Theta(\sqrt{3}\,h\alpha)$. It may be found
as a triple integral over a region bounded by parallel planes, each at a distance,
$\sqrt{3}\,\alpha$, from the origin, analogous to S in the figure. Likewise, for three
measurements, a being positive, $P_\alpha'$ is the integral over a region between two
concentric spheres, centered at (a, a, a), and tangent to the two planes
bounding the new S. The zones of the sphere, C, cut off by the regions,
S and A, have the same altitude and thus the same area. On these zones,
for fixed r, the integrand is a constant, and each of the two zones gives
the same integral. But as r goes from zero toward infinity, the region A
becomes exhausted, whereas S does not. Hence $P_\alpha' < P_\alpha$. But, in general,
their difference would be inappreciable. No condition, such as a > 0,
is needed in dealing with three measurements. For, in case a = 0, the
region A becomes a sphere lying in S. And if a < 0, $P_\alpha'$ is zero for a small
α, if the usual convention, giving to the radical the positive sign, is
adopted. But if the radical is to be made always negative, the treatment
is essentially that for a > 0.

When n is very large, a certain presumption exists that the arithmetic
mean, M, has a greater probability than the root-mean-square, M'. For
if M is positive, and $v_1, v_2, \ldots$ are the residuals, then

$$M' = M\left[1 + \frac{\sum v^2}{nM^2}\right]^{1/2}.$$

Now the arithmetic mean is subject to the Gaussian law with precision
constant, $\sqrt{n}\,h$. And it will be proved presently that the probability of
bM is less than that of M, when b is a constant greater than unity. When
n is very large, the bracket above is supposed to be sensibly a constant, and
it is greater than unity.

That the arithmetic mean, M, of n measurements, each subject to
the Gaussian law with precision, h, is subject to this law, with precision
$\sqrt{n}\,h$, may be proved as follows. The condition that the error of M shall
lie in (−α, α), is equivalent to the condition that the sum, $\sum x$, of the errors
of the measurements shall lie in (−nα, nα). The probability for this is
an n-fold integral taken over a region bounded by two "parallel planes"
in n dimensions. Each plane is at a "distance," $\sqrt{n}\,\alpha$, from the origin.
By an orthogonal transformation ("rotation") the planes can be made
"perpendicular" to an "axis." The following is such a transformation:
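$$
\begin{aligned}
X_1 &= \frac{x_1}{\sqrt{n}} + \frac{x_2}{\sqrt{n}} + \cdots + \frac{x_n}{\sqrt{n}}, \\
X_2 &= \frac{x_1}{\sqrt{2}} - \frac{x_2}{\sqrt{2}}, \\
X_3 &= \frac{x_1}{\sqrt{2\cdot 3}} + \frac{x_2}{\sqrt{2\cdot 3}} - \frac{2x_3}{\sqrt{2\cdot 3}}, \\
&\;\;\vdots \\
X_r &= \frac{x_1}{\sqrt{(r-1)r}} + \frac{x_2}{\sqrt{(r-1)r}} + \cdots + \frac{x_{r-1}}{\sqrt{(r-1)r}} - \frac{(r-1)x_r}{\sqrt{(r-1)r}},
\end{aligned}
$$

where 2 ≤ r ≤ n. Then $X_1 = \sum x/\sqrt{n}$, and lies in $(-\sqrt{n}\,\alpha, \sqrt{n}\,\alpha)$.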
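The orthogonality of this transformation can be verified directly. The sketch below builds the rows as read from the formulas above (first row with coefficients 1/√n, row r with r − 1 equal coefficients followed by one negative coefficient and zeros) and checks that the rows are orthonormal. The function names and the choice n = 5 are hypothetical.

```python
import math


def helmert_rows(n):
    """Rows of the transformation: row 1 sums the errors, row r (2 <= r <= n) contrasts them."""
    rows = [[1.0 / math.sqrt(n)] * n]
    for r in range(2, n + 1):
        scale = math.sqrt((r - 1) * r)
        # r - 1 coefficients 1/scale, then -(r - 1)/scale, then zeros.
        row = [1.0 / scale] * (r - 1) + [-(r - 1) / scale] + [0.0] * (n - r)
        rows.append(row)
    return rows


def dot(u, v):
    """Ordinary inner product of two coefficient lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


n = 5  # hypothetical number of measurements
T = helmert_rows(n)
gram = [[dot(T[i], T[j]) for j in range(n)] for i in range(n)]
for row in gram:
    print([round(value, 12) for value in row])
```

The printed matrix of inner products is the identity up to rounding, so the transformation preserves the sum of squares in the exponent of the integrand, and the first new coordinate is the sum of the errors divided by √n.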

3. The Comparative Probabilities of m and bm, and also of M and bM. 

Since the average, M, of n measurements, is subject to the Gaussian
probability law, with precision constant, $\sqrt{n}\,h$, a comparison of the
probabilities of m and bm, where b is a constant, is likewise a comparison
of the probabilities of M and bM, with the proper change from h to $\sqrt{n}\,h$.

Theorem 3. Under the Gaussian law, the probability of bm is less than
that of m, if b is a constant greater than unity; but there exist positive values of
the constant b for which the probability of bm is greater than that of m.

Proof. Let $P_\alpha$ be the probability that m will differ from a by less than
α, and let $P_\alpha'$ be the probability that

$$a - \alpha \le bm \le a + \alpha,$$

or

$$a - \frac{a + \alpha}{b} \le a - m \le a - \frac{a - \alpha}{b}.$$

By hypothesis, this probability is

$$\frac{h}{\sqrt{\pi}} \int_{x'}^{x''} e^{-h^2x^2}\,dx,$$

where

$$x' = a - \frac{a + \alpha}{b}, \qquad x'' = a - \frac{a - \alpha}{b}. \tag{10}$$



The interval of integration is, in length, 2α/b. In the special case, where
a = 0, $P_\alpha' > P_\alpha$ if 0 < b < 1; but $P_\alpha' < P_\alpha$ if b > 1. Also, in all other
cases,

$$P_\alpha' < P_\alpha, \quad\text{if } b > 1.$$

For the interval of integration for $P_\alpha'$ is smaller, and is less favorably
situated, being centered at a − a/b. Likewise, when 0 < b < 1, the center
of the interval is at a − a/b; but the length of the interval is greater than 2α,
the interval for $P_\alpha$. It will now be shown that if b is taken sufficiently
close to unity, thus bringing the center of the interval close to the origin,
the advantage which $P_\alpha'$ has in length of interval outweighs the
disadvantage in position; i.e., $P_\alpha' > P_\alpha$.

Now the integrand is greatest when x = 0. Hence

$$P_\alpha < \frac{2\alpha h}{\sqrt{\pi}}.$$

On the other hand, if a > 0, the integrand for $P_\alpha'$ is least when
x = a − (a + α)/b. Thus

$$P_\alpha' > \frac{2\alpha}{b} \frac{h}{\sqrt{\pi}}\, e^{-\frac{h^2}{b^2}[a(1-b)+\alpha]^2}.$$

Hence $P_\alpha' > P_\alpha$ provided b and α can be so taken that

$$\frac{1}{b}\, e^{-\frac{h^2}{b^2}[a(1-b)+\alpha]^2} \ge 1;$$

that is,

$$\frac{h^2}{b^2}[a(1-b)+\alpha]^2 < \log_e \frac{1}{b}.$$

Take α ≤ a(1 − b), and let 1/b = 1 + y. It is then to be shown that y
can be taken positive but small enough so that

$$4h^2a^2y^2 < \log_e (1 + y).$$

But when y is sufficiently small,

$$4h^2a^2y^2 < y - \frac{y^2}{2} < \log_e (1 + y). \quad\text{Q.E.D.}$$

The proof for the case, a < 0, is essentially the same.
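The trade-off in Theorem 3 between the length of the interval and its position can be tabulated numerically. The sketch below evaluates the probability of bm from the limits x' and x'' given above, using the error function for the integral; the function name and the values h = 2, a = 1, α = 0.001 are hypothetical.

```python
import math


def prob_bm_within(h, a, alpha, b):
    """Probability that b*m lies in (a - alpha, a + alpha), for b > 0."""
    x_lo = a - (a + alpha) / b   # lower limit x' of the error a - m
    x_hi = a - (a - alpha) / b   # upper limit x''
    return 0.5 * (math.erf(h * x_hi) - math.erf(h * x_lo))


h, a, alpha = 2.0, 1.0, 0.001  # hypothetical values with ha = 2
p_m = prob_bm_within(h, a, alpha, 1.0)  # b = 1 gives the probability of m itself
for b in (0.80, 0.90, 0.95, 1.00, 1.05, 1.20):
    print(b, prob_bm_within(h, a, alpha, b) / p_m)
```

For these sample values the ratio exceeds unity at b = 0.90 and b = 0.95, and falls below unity for b greater than 1 and also at b = 0.80, where the interval has moved too far from the origin for its extra length to compensate.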

Example. If the Gaussian law be assumed, and if it be admitted that
of two values for the unknown it is better to accept that which has the
greater probability, in accordance with the definition adopted in this
paper, then the foregoing theorem implies that if a meter bar is measured
in inches with the result, 39.37, it is better to accept for the length of the
bar some number a little less than 39.37 than to accept 39.37 itself. This
applies whether 39.37 is a single measurement or is the arithmetic mean of
a set of measurements taken under the same circumstances; for the
arithmetic mean is subject to the Gaussian law when the individual
measurements are thus subject.

But the difference of 39.37 and such a number is so small, if the
measurements have been made with even a moderate degree of accuracy, as to
be negligible. This may be seen as follows. With α small, the integral
(10), as a function of b, has a maximum for approximately

$$b = 1 - \frac{1}{2h^2a^2}. \tag{11}$$

Furthermore, $P_\alpha' > P_\alpha$ if

$$1 > b > 1 - \frac{1}{2h^2a^2}, \tag{12}$$

and α is small enough. Or, if the arithmetic mean of n measurements is
used, then in place of (12), we have

$$1 > b > 1 - \frac{1}{2nh^2a^2}. \tag{13}$$

But, ordinarily, $h^2a^2$ is very large, and this multiplier, b, to be used upon
m or upon the arithmetic mean, M, does not differ appreciably from unity.
Indeed, in the case of the meter bar, it may be reasonably certain that
a > 39. Now the commonly accepted formula* connecting h with the
so-called "probable error," r, is

hr = .476936.

* Czuber, Wahrscheinlichkeitsrechnung, I, p. 270.

To take r = .01 signifies that we suppose it as likely that the error of a
measurement will be numerically less than a hundredth of an inch as that
it will exceed that amount. In this case, h > 47; and, hence,

$$h^2a^2 > 3{,}000{,}000.$$
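The arithmetic of the meter-bar example can be reproduced in a few lines. This is a sketch under a stated assumption: the multiplier is taken from (11) read as b = 1 − 1/(2h²a²), which is an interpretation of the formula and not a value quoted in the text. The variable names are hypothetical.

```python
import math

RHO = 0.476936   # the product h * r, as quoted in the text
r = 0.01         # probable error in inches, from the example
a_lower = 39.0   # lower bound on the true length in inches, from the example

h = RHO / r                     # precision constant implied by r
k = (h * a_lower) ** 2          # lower bound on h^2 a^2
b_opt = 1.0 - 1.0 / (2.0 * k)   # assumed reading of (11): b = 1 - 1/(2 h^2 a^2)
shift = 39.37 * (1.0 - b_opt)   # resulting change in the accepted length, in inches

print(h, k, b_opt, shift)
print(math.erf(RHO))            # close to 0.5, the defining property of the probable error
```

Under these assumptions the recommended value differs from 39.37 by only a few millionths of an inch, which illustrates why the text calls the difference negligible.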

A modification of the foregoing theorem consists in the use of a function,
bm + c, with b < 1 and c > 0, provided a > 0. The c does not affect
the length of the interval of integration, but merely its position. With b
fixed, there exist positive values of c which move the interval toward the
origin and thus augment the probability.

Theorem 4. Under the Gaussian probability law, the probability of
m + c is less than that of m, for every value of the constant, c, not zero.

The proof of this theorem follows at once from considerations given 
above. 

4. The Probability of the Median. 

By the median of 2ν + 1 measurements is meant the middle or (ν + 1)th
measurement when they are arranged according to magnitude. If there
are three measurements, the first or second or third measurement made,
in order of time, may be the median. If the first is the median, the
measurement less than the median may be the second, or it may be the
third. The probability that the error of $m_1$ will lie in (−α, α) is very
nearly $2h\alpha/\sqrt{\pi}$ when α is small; the probability that $m_2$ will then be less
than $m_1$ is nearly 1/2; and the probability that $m_3$ will then be greater than
$m_1$ is nearly 1/2. There being six arrangements of the m's, the probability of
the median is nearly $3h\alpha/\sqrt{\pi}$. This is about 87 per cent of the approximate
probability, $2\sqrt{3}\,h\alpha/\sqrt{\pi}$, of the arithmetic mean. The exact probability,
according to the Gaussian law, that the median of 2ν + 1 measurements
will lie in (a − α, a + α) is

$$P_\alpha' = \frac{(2\nu+1)!}{(\nu!)^2} \frac{h}{\sqrt{\pi}} \int_{-\alpha}^{\alpha} \left[\frac{1 - \Theta^2(hx)}{4}\right]^{\nu} e^{-h^2x^2}\,dx.$$

In this expression, Θ and thus its square become negligible when, for any
given ν, α is made sufficiently small.
By Stirling's formula,*

$$1\cdot 2\cdot 3 \cdots \nu = \nu! = \sqrt{2\pi\nu}\;\nu^{\nu} e^{-\nu + \theta/(12\nu)}, \quad 0 < \theta < 1.$$

Hence, if $P_\alpha$ is the probability that the arithmetic mean will lie in
(a − α, a + α),

$$\frac{P_\alpha'}{P_\alpha} = \sqrt{\frac{2\nu+1}{\pi\nu}}\; e^{\theta'/(24\nu) - \theta/(6\nu)}\,(1 + \epsilon),$$

where $\lim_{\alpha=0} \epsilon = 0$, 0 < θ < 1, 0 < θ' < 1.

* See, for example, Broggi, Traité des Assurances sur la Vie (1907), p. 54.

Theorem 5. The probability of the median of an odd number, 2ν + 1,
of measurements is less than that of their arithmetic mean, under the Gaussian
law; and if ν is made sufficiently large, and then α taken small enough, the
ratio of $P_\alpha'$ to $P_\alpha$ can be made as near $\sqrt{2}/\sqrt{\pi} = .7979$ as we please.

Thus, with a large number of measurements, the probability of the
median falls about 20 per cent short* of that of the arithmetic mean.
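The limiting ratio in Theorem 5 can be tabulated for a few values of ν. The sketch below assumes the small-α leading terms only: for the median, the count of arrangements (2ν+1)!/(ν!)² times (1/2)^{2ν} times 2hα/√π, which generalizes the three-measurement argument above, and for the mean, 2√(2ν+1) hα/√π. The function name is hypothetical.

```python
import math


def median_to_mean_ratio(nu):
    """Small-alpha ratio of the median's probability to the mean's, for 2*nu + 1 measurements."""
    # Median, leading term: (2nu+1)! / (nu!)^2 * (1/4)^nu * 2*h*alpha/sqrt(pi).
    # Mean, leading term: 2*sqrt(2nu+1)*h*alpha/sqrt(pi). The factor h*alpha cancels,
    # leaving comb(2nu, nu) / 4^nu * sqrt(2nu+1).
    central = math.comb(2 * nu, nu) / 4 ** nu  # exact integer division, safe for large nu
    return central * math.sqrt(2 * nu + 1)


for nu in (1, 2, 5, 50, 1000):
    print(2 * nu + 1, median_to_mean_ratio(nu))
print(math.sqrt(2.0 / math.pi))  # the limit named in Theorem 5
```

For ν = 1 the ratio is √3/2, the "about 87 per cent" of the three-measurement case, and the values decrease toward √2/√π as ν grows.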

5. The Probability of the Geometric Mean. 

By the geometric mean of two positive measurements, $m_1$ and $m_2$,
will be meant $+\sqrt{m_1m_2}$; but the negative radical will be taken if both are
negative; and the mean will not be regarded as defined, if they have unlike
signs. Likewise, for n measurements, the geometric mean will be regarded
as positive, if all measurements are positive; negative, if all measurements
are negative; otherwise, undefined.

Theorem 6. The probability of the geometric mean is less than that of 
the arithmetic mean, under the Gaussian law. 

This can be proved for the three cases: a > 0, a = 0, a < 0. In the
case of two measurements, when a > 0, the field of integration for the
geometric mean is bounded by two equilateral hyperbolas, tangent to the
boundaries of S (see figure) at $T_1$ and $T_2$; and extending out the second
quadrant, and down the fourth quadrant, asymptotic to the lines, y = a,
and x = a. The segment, $T_1T_2$, is the only straight line segment with
slope, unity, joining the two hyperbolas, and having a length as great as
$2\sqrt{2}\,\alpha$. If the axes are rotated through 45°, and then integration is
performed first with respect to x, the integral, with y fixed and not zero, will
be less for the geometric mean than for the arithmetic mean. In n
dimensions, the proof is facilitated, by translating to (a, a, ..., a) as a new
origin, using the "surfaces" upon which the points have positive
coordinates, and showing that if $(x_1, x_2, \ldots, x_n)$ lies on one surface,
$(x_1 + 2\alpha, x_2 + 2\alpha, \ldots, x_n + 2\alpha)$ falls "outside" the region bounded by
the two surfaces.

6. The Probability of the Weighted Mean. 
By a weighted mean of n measurements is meant

$$p_1m_1 + p_2m_2 + \cdots + p_nm_n,$$

where the p's are given constants such that

$$p_1 + p_2 + \cdots + p_n = 1.$$

* This does not necessarily discredit the use of the median in economic, biological or other
investigations. Only Gaussian distributions are being considered in this article.

Theorem 7. If n measurements are subject to the Gaussian law, with
precision constants, $h_1, h_2, \ldots, h_n$, respectively, then any weighted mean is also
subject to the Gaussian law,* with precision constant, H, where

$$\frac{1}{H^2} = \sum \left(\frac{p}{h}\right)^2.$$

This H takes its greatest value when

$$p_1 = \frac{h_1^2}{\sum h^2}, \quad p_2 = \frac{h_2^2}{\sum h^2}, \quad \ldots, \quad p_n = \frac{h_n^2}{\sum h^2};$$

and in this case,

$$H^2 = \sum h^2;$$

and the probability of this weighted mean is greater than that of any other
weighted mean.

The proof of this theorem† involves the use of an orthogonal
transformation which can be set up with zero coefficients in the same places as in the
orthogonal transformation indicated for the arithmetic mean. Now H
has a maximum when $\sum (p/h)^2$ has a minimum. The case, $p_n = 0$, can be
considered separately. Otherwise,

$$p_n = 1 - p_1 - p_2 - \cdots - p_{n-1}$$

can be inserted, and the minimum located by setting the first partial
derivatives equal to zero. For this point, an actual minimum and least value
will occur; since […]

It should be noted that this weighted mean has not been proved to be
the "most probable value." In fact, since this weighted mean, W, is
subject to the Gaussian law, there are constants, b, a little less than unity,
as has been proved in Theorem 3, such that bW has a greater probability
than W.
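The statement of Theorem 7 can be illustrated with a small computation. The sketch below assumes the relation 1/H² = Σ(p/h)² for the precision of a weighted mean and the weights proportional to h² for the optimum, as read from the theorem; the function names and the sample precisions are hypothetical.

```python
import math


def weighted_precision(weights, precisions):
    """Precision H of a weighted mean, from 1/H^2 = sum((p/h)^2); weights must sum to 1."""
    return 1.0 / math.sqrt(sum((p / h) ** 2 for p, h in zip(weights, precisions)))


def best_weights(precisions):
    """Weights proportional to h^2, which Theorem 7 names as giving the greatest H."""
    total = sum(h * h for h in precisions)
    return [h * h / total for h in precisions]


hs = [1.0, 2.0, 4.0]  # hypothetical precision constants of three measurements
p_best = best_weights(hs)
H_best = weighted_precision(p_best, hs)
H_equal = weighted_precision([1.0 / len(hs)] * len(hs), hs)

print(p_best, H_best, H_equal)
```

With these sample precisions the optimal weights give H² equal to the sum of the squares of the h's, and equal weights give a smaller H. Equal weights are optimal only when all the h's coincide.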



* Cf. Czuber, Wahrscheinlichkeitsrechnung, I, p. 260.

† For a generalization of this theorem, see: Dodd, "The Least Square Method grounded
with the aid of an Orthogonal Transformation," Jahresbericht der Deutschen Mathematiker-
Vereinigung, 21. Band (1912), p. 177.




Corollary. The probability of the arithmetic mean of n measurements
with the same precision, h, is greater than the probability of any other linear
homogeneous function of the measurements with constant coefficients whose
sum is unity.

For, in the first place, the arithmetic mean is the weighted mean for
which each weight is 1/n. This weight, 1/n, is the most favorable weight
for each measurement, when all the h's are equal. The general formula,

$$H^2 = \sum h^2,$$

becomes in this case,

$$H^2 = nh^2,$$

as found before.



